ABSTRACT

This paper presents a novel approach to bug-finding analysis
and an implementation of that approach. Our goal is
to find as many serious bugs as possible. To do so, we
designed a flexible, easy-to-use extension language for
specifying analyses and an efficient algorithm for executing these
extensions. The language, metal, allows the users of our
system to specify a broad class of analyses in terms that
resemble the intuitive description of the rules that they check.
The system, xgcc, executes these analyses efficiently using a
context-sensitive, interprocedural analysis.

Our prior work has shown that the approach described
in this paper is effective: it has successfully found thousands
of bugs in real systems code. This paper describes the
underlying system used to achieve these results. We believe
that our system is an effective framework for deploying new
bug-finding analyses quickly and easily.

1. INTRODUCTION

This paper describes the implementation of an unusual
approach to finding bugs that we call metacompilation
(MC). The focus of our approach is pragmatism: we want
to find as many serious bugs as possible. We do so using
programmer-written compiler extensions (checkers). This
paper presents a language, metal, for implementing these
extensions, and an analysis engine, xgcc, that executes
extensions using a context-sensitive, interprocedural analysis.

The main barrier to finding bugs is simply knowing the
correctness rules that code must obey. The more rules you
can check, the more bugs you will find. Thus, we designed
metal to be (1) easy to use and (2) flexible enough to
express a broad range of rules within a unified framework.
Metal must be easy to use since many rules are known only
to programmers; if they cannot write extensions, we
cannot check these rules. Thus, metal is designed for system
implementers, not compiler writers. Metal must be flexible
because we want to check arbitrary rules. We do not want
a system that is limited to checking a specific set of
properties (e.g., synchronization constraints, temporal rules) or a
specific underlying assumption (e.g., "the analysis must be
conservative").

Metal is easy to use because it provides the state
machine (SM) as a fundamental abstraction. State machines
are an easy abstraction because they are a familiar concept
in systems programming. Metal is flexible because it allows
the extension writer to enhance the SM abstraction in
near-arbitrary ways with general-purpose code. Metal's flexibility
allows extensions to make the analysis rule-specific without
modifying the language or the underlying system.

Our prior work has shown that metal works well. It
requires little investment to get results: a day's work can
produce an extension that finds tens or even hundreds of
serious errors in actual code. Further, extensions are small,
usually between 10 and 200 lines of code, depending mostly
on the amount of error reporting that they do. Metal's
flexibility is demonstrated by the fact that we were able to
write over fifty checkers that express significantly different
types of analyses including: (1) finding violations of known
correctness rules [1, 9] and (2) automatically inferring such
rules from source code [10]. We describe metal in Sections 2
through 4.

We have three main requirements for xgcc. It must: (1)
provide the analysis needed to find bugs, (2) not significantly
restrict what metal extensions can do, and (3) scale to large
programs. Our ideal division of labor is that extensions
encode only the property to check, leaving the details of how
to check the rule to xgcc. The second and third requirements
are important since the more rules we check and the more
code we analyze, the more bugs we will find. The main
restriction that xgcc places on extensions is determinism; they
can otherwise perform arbitrary computations internally. In
this paper, we present the analysis algorithm, implemented
in xgcc, that executes our extensions. We describe xgcc in
Sections 5 and 6.

In Section 7, we discuss the approximations that our
analyses make and their implications. Section 8 discusses
several analysis techniques for handling false positives,
including a simple, path-sensitive analysis for eliminating
nonexecutable paths. Section 9 continues the false positive
discussion by presenting the ways in which xgcc ranks
error reports. Finally, Section 10 discusses related work and
Section 11 concludes.

1: state decl any_pointer v;
2:
3: start: { kfree(v) } ==> v.freed;
4:
5: v.freed: { *v } ==> v.stop,
6:     { err("using %s after free!", mc_identifier(v)); }
7:   | { kfree(v) } ==> v.stop,
8:     { err("double free of %s!", mc_identifier(v)); }
9: ;

Figure 1: Free Checker

2.  OVERVIEW 

Our extensions are written in metal, a language for
expressing a broad class of customized, static, bug-finding
analyses. The common thread among these analyses is that
they all exploit the fact that many abstract program
restrictions map clearly to source code actions [9]. While metal
extensions are executed much like a traditional dataflow
analysis, they can easily be augmented in ways outside the scope
of traditional approaches, such as using statistical analysis
to discover rules [10].

To check a rule, an extension does two things: (1)
recognizes interesting source code actions relevant to a given rule
and (2) checks that these actions satisfy some rule-specific
constraint. Metal organizes extensions around a state
machine (SM) abstraction. State machines are a concise way to
represent many program properties. Note that the SM
abstraction provides sugar for common operations; it does not
limit extensions to checking finite-state properties. When
needed, extensions can be augmented with general-purpose
code. Metal extensions are executed by the interprocedural
analysis engine, xgcc.

Figure  1  shows  the  free  checker  that  flags  when  freed 
pointers  are  dereferenced  or  double-freed.  We  use  this 
checker  and  the  code  example  in  Figure  2  throughout  the 
paper.  The  extension  will  find  two  errors  in  the  example 
(lines  12  and  17). 

2.1  Metal  Extensions  and  State  Machines 

Metal extensions define a collection of one or more state
machines. During execution of an extension, the current
state of the extension is simply the combination of all the
current states of the underlying state machines that the
extension defines. Each of these state machines is logically
separate: transitions in one SM do not affect any of the
others. The number of state machines grows and shrinks during
the course of the analysis.

Each individual SM's current state consists of one global
state value and one or more variable-specific state
values. Global state values capture a program-wide
property (e.g., "interrupts are disabled"). Variable-specific state
values capture program properties associated with specific
source objects (e.g., "pointer p is freed").

Each state value defined above is assigned to an instance
of a state variable. Each extension defines one global state
variable and, optionally, a variable-specific state variable.
For simplicity, the discussion assumes that an extension has
exactly one of each. A state variable has one or more
instances, each of which is assigned a state value. The global
state variable has exactly one instance that persists
throughout the analysis. The variable-specific state variable has one
instance for each program object with an attached state.
The number of such instances grows and shrinks as the
analysis decides to track new program objects and ignore
previously tracked objects. An SM state consists of the value
of the global instance and the value of one of the
variable-specific instances. Thus, the number of SMs defined within
each extension at a given point in the analysis is equal to
the number of program objects with attached state.

 1: int contrived(int *p, int *w, int x) {
 2:     int *q;
 3:
 4:     if(x)
 5:     {
 6:         kfree(w);
 7:         q = p;
 8:         p = 0;
 9:     }
10:     if(!x)
11:         return *w; // safe
12:     return *q; // using 'q' after free!
13: }
14: int contrived_caller(int *w, int x, int *p) {
15:     kfree(p);
16:     contrived(p, w, x);
17:     return *w; // using 'w' after free!
18: }

Figure 2: Free Checker Example Code

In the free checker, the variable-specific state variable,
v, is declared with the keywords state decl. The notation
v.freed means that the state value freed is bound to v. Thus,
only instances of v can be assigned the value freed. The
global state variable is implicitly defined. The state value
start is bound to the global state variable because it has no
explicit binding.

The alphabet of each SM is defined by the metal
patterns used within the extension. Patterns are used to
identify source code actions that are relevant to a particular
rule. The free checker uses patterns to recognize
deallocations (using the pattern "{kfree(v)}") and dereferences of
deallocated variables (using the pattern "{*v}"). The
variable v in these patterns will match pointers of any type.

Each state value defines a list of transitions. In the free
checker, the start state defines a single transition rule and
the v.freed state defines two. The transition for the start
state (line 3) says that when the global instance has the
value start and the current program point matches the
pattern {kfree(v)}, a transition should execute that attaches
the state freed to the abstract syntax tree (AST) matching v
(i.e., the freed pointer). The transition from start to v.freed
is a special type of transition that creates a new instance of
v and, thus, a new state machine.

The v.freed state value has two transition rules: the first
triggers when a freed variable is dereferenced, and the second
triggers when a freed variable is freed again. Both
transitions print an error message that describes the error and
identifies the particular variable to which the erroneous
action was applied. A transition that begins in a
variable-specific state value is triggered by a specific instance of the
state variable bound to that value. Thus, the two
transitions in the v.freed state are triggered when one of the freed
variables that the extension is tracking is either double-freed
or dereferenced. These transitions update the value of the
instance that triggered the transition to the special value
stop. When an instance is assigned the value stop, the state
machine tracking that instance is removed from the
extension's collection of SMs. However, if the variable associated
with the instance is freed again, the transition in the start
state will execute and thus reinstantiate the deleted SM.
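
The transitions of the free checker can be mimicked by hand in plain C. The sketch below is only an illustration of the state machine, not how metal or xgcc implement it: program actions are reduced to a hypothetical event stream (EV_KFREE for a match of {kfree(v)}, EV_DEREF for a match of {*v}), pointers are identified by name, and the names free_checker, find_freed, and free_checker_step are invented for this example.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical events standing in for the patterns {kfree(v)} and {*v}. */
enum event_kind { EV_KFREE, EV_DEREF };

struct event {
    enum event_kind kind;
    const char *var;   /* name of the pointer the action applies to */
};

#define MAX_TRACKED 16

/* The set of instances of v currently in the freed state. */
struct free_checker {
    const char *freed[MAX_TRACKED];
    int nfreed;
    int errors;
};

/* Index of var among the tracked instances, or -1 if it is not tracked. */
static int find_freed(const struct free_checker *c, const char *var)
{
    for (int i = 0; i < c->nfreed; i++)
        if (strcmp(c->freed[i], var) == 0)
            return i;
    return -1;
}

/* Apply the checker to one event on the current path. */
static void free_checker_step(struct free_checker *c, struct event ev)
{
    int i = find_freed(c, ev.var);

    if (i >= 0) {
        /* v.freed: either pattern reports an error and moves v to stop. */
        if (ev.kind == EV_DEREF)
            printf("using %s after free!\n", ev.var);
        else
            printf("double free of %s!\n", ev.var);
        c->errors++;
        c->freed[i] = c->freed[c->nfreed - 1];   /* stop: drop the instance */
        c->nfreed--;
        return;
    }
    if (ev.kind == EV_KFREE && c->nfreed < MAX_TRACKED)
        c->freed[c->nfreed++] = ev.var;          /* start: create a new instance */
}
```

Feeding the events kfree(p), then *p, reports one use-after-free and leaves no tracked instances; a later kfree(p) would start tracking p again, which mirrors the reinstantiation described above.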

The initial state of an extension contains one state
machine that expresses the fact that nothing is known about
the program at the start of the analysis. Thus, the global
state variable in the free checker initially has the value start,
and v has the special value <> that reflects the fact that
the extension does not know about any freed variables.

xgcc applies an extension to the control flow graph
(CFG) for a single function in depth-first order, one
execution path at a time, beginning at the entry points to the
callgraph for the source base. At each program point, the
extension looks for executable transitions in any of the current
SMs. After iterating over all the SMs, the analysis moves
on to the next program point. As described in Section 8,
xgcc also enhances the extension with additional analysis to
prune non-executable paths, follow simple value flow, and
delete the state attached to an expression that is redefined.

2.2  Execution  of  the  Free  Checker 

We tie all of these pieces together by following the
execution of the free checker on the example in Figure 2.

1. Line 14: contrived_caller has no known callers and
is, thus, an entry point to the callgraph for our
example. We assume that none of the input parameters are
aliased. The extension begins in the initial state.

2. Line 15: The kfree call will match the pattern in the
start state and the transition on line 3 of the checker
will execute, attaching the state freed to p.

3. Line 16: xgcc follows the call to contrived, tracking
the variable p because it is passed as a parameter.

4. Line 4: The analysis splits down the true and false
paths, following the true path first. When the
analysis splits, a separate copy of the extension's state is
applied to each path. The analysis tracks that x is not
equal to 0 down the true path and equals 0 down the
false path.

5. Line 6: The call to kfree places w in the freed state.
At this point, there are two instances of v with the
value freed: p and w.

6. Line 7: The assignment causes xgcc to transparently
create another instance of v for the variable q, also in
the freed state.

7. Line 8: The assignment to variable p causes xgcc to
transition p to the stop state, removing p from the
extension's state.

8. Line 10: Rather than splitting at the conditional, xgcc
uses the information that x is non-zero on this path
to prune the true branch. If the true branch were
followed, there would be a false error report at line 11
because w has attached state freed (line 6).

9. Line 12: The dereference pattern for v.freed matches *q
and reports a use-after-free error. q is transitioned to
the stop state. After analyzing the return, the analysis
backtracks to follow the false branch from line 4.

10. Line 10: Rather than splitting at the conditional, xgcc
uses the information that x is equal to 0 on this path
to prune the false branch.

11. Line 11: The path ends. We have explored all paths
through contrived.

12. Line 17: Control returns to the caller. The set of
outgoing instances of v is the union of all instances active
at the exit from any path through contrived. There
are two such instances, p and w, active at lines 11 and
12, respectively. The extension flags an error at the
subsequent dereference on line 17.

The  next  two  sections  describe  metal  in  more  detail. 

3.  METAL  STATES  AND  TRANSITIONS 

3.1  Metal  States 

Each state variable's domain consists of all the state
values bound to that variable. This section elaborates the
discussion of state variables and provides a more precise
definition of the extension's state and each state machine within
it. The definition of extension state that we describe here is
translated to the data structures described in Section 5 that
define an extension from xgcc's perspective.

The extension must be allowed to extend the state space
using general-purpose code. The advantage of this form of
flexibility is that it allows our extensions to express
properties where the state space is defined dynamically.

We allow extensions to grow the state space by
extending the domain of each instance within general-purpose code.
For this reason, we enhance each variable-specific instance
with a data value that is a C structure of arbitrary size that
the extension can manipulate within the escapes to C code.
Extensions may also update the value of the global instance
directly within an escape to C code to allow more complex
transitions.

An extension's state is defined as a set of state tuples,
each of which corresponds to a single SM contained within
that extension. A state tuple has one component that is
filled by the value of the global instance. In the free
example, this slot always contains the value start. The second
component contains the value of a variable-specific instance
(e.g., an instance of v in the free checker). For example,
after analyzing line 15 in Figure 2, the free checker's state
would include the tuple (start, v:p -> freed) because the
state variable v has an instance attached to the program
object p whose value is freed.

While the state tuples in this paper have only two
components, the actual implementation of metal allows the
extension to define tuples with additional components. The
actual implementation of the algorithms in this paper
handles the more general case.

3.2  Metal  Transitions 

A simple metal transition consists of a source state value,
a pattern, and a destination state value. The transition on
line 3 of the free checker follows this template. The extension
determines which transitions to execute by iterating through
both global and variable-specific instances and determining
whether the value of each instance defines a transition that
can execute. A transition can execute if its pattern matches
at the current point in the analysis. An instance cannot
trigger a transition at the statement where that instance
was created; this restriction prevents a variable that is freed
for the first time from triggering a double-free error at the
same program point. Simple transitions can be enhanced
with path-specific destination states and C code actions.

state decl { lock_t } l;

start:
    { trylock(l) != 0 } ==> true=l.locked, false=l.stop
  | { trylock(l) == 0 } ==> true=l.stop, false=l.locked
  | { lock(l); } ==> l.locked
  | { unlock(l); } ==>
        { err("%s is not locked", mc_identifier(l)); }
  ;

l.locked:
    { lock(l); } || { trylock(l) } ==>
        { err("dbl. lock of %s", mc_identifier(l)); }
  | { unlock(l); } ==> l.unlocked
  | $end_of_path$ ==>
        { err("path ends with lock held"); }
  ;

Figure 3: Lock checker

Path-specific transitions. Path-specific transitions
allow the extension to track the value of simple boolean
predicates (e.g., l is locked, p is null) or model functions
that can have two possible outcomes. If a transition occurs
at a branch condition in the source code, the extension can
specify a different destination state depending on whether
the analysis follows the true branch or the false branch from
the condition. Figure 3 shows the lock checker, which warns
when locks are (1) released without being acquired, (2)
double acquired, or (3) not released at all. The routine trylock,
used for nonblocking lock acquisition, returns 1 if it acquires
the lock and 0 otherwise. Thus, in the first transition, we
attach the state locked to the lock on the true path, and the
state stop to the lock on the false path. The special pattern
$end_of_path$ in the last transition evaluates to true when
either an instance of l in the locked state permanently leaves
scope or when the program terminates.
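
The choice of destination state for the two trylock transitions can be written as a small C function. This is a sketch for illustration only: the enum names and trylock_dest are hypothetical, and the only facts taken from the text are that trylock returns a nonzero value exactly when the lock is acquired and that the destination depends on which branch the analysis follows.

```c
/* Destination states used by the two trylock transitions. */
enum lock_state { L_STOP, L_LOCKED };

/* Shape of the branch condition: trylock(l) != 0 or trylock(l) == 0. */
enum cond_form { COND_NE_ZERO, COND_EQ_ZERO };

/*
 * Destination state for lock l along one branch of the condition.
 * branch_taken is 1 for the true branch and 0 for the false branch.
 */
static enum lock_state trylock_dest(enum cond_form form, int branch_taken)
{
    /* The lock was acquired exactly when trylock returned nonzero. */
    int acquired = (form == COND_NE_ZERO) ? branch_taken : !branch_taken;

    return acquired ? L_LOCKED : L_STOP;
}
```

For the condition trylock(l) != 0 the true branch yields L_LOCKED and the false branch L_STOP; for trylock(l) == 0 the two are swapped, which matches the first two transitions of the start state.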

C code actions. Transitions can include C code
actions that execute whenever the transition executes. Actions
are another way that an extension can extend the basic SM
abstraction. C code actions allow the extension to perform
arbitrary computations whenever a transition executes. We
describe two types of actions that we have found useful:
those that perform complex error reporting and those that
enhance the analysis machinery.

To make error messages useful, checkers must report not
only what the error was, but also why the error occurred.
Thus, all of our checkers track the calculations that found
each error. These calculations depend on the particular
characteristics of the extension. The code to track why an
error was flagged accounts for the bulk of each extension.

In [10], we describe several checkers that use
statistical analysis to infer checking rules. For example, to infer
whether routines a and b must be paired: (1) assume that
they must, (2) count the number of times they occur
together and (3) count the number of times they do not (rule
violations). The reported violations are then sorted using
a statistical significance test. We implemented this
functionality by using the C code actions to count the correct
pairings and violations during the analysis. (Section 9 uses
the same technique to rank rule violations.)

By default, a metal extension has a finite, statically
determined domain for each state variable. The extension can
extend this model by using C code actions to manipulate
the extension's state directly using xgcc's internal interface.
For example, we could extend the lock checker described
above to handle recursive locks by using the data values in
each instance of l to track the current depth of the lock.
Whenever a lock operation or an unlock operation occurs,
the resulting transition could either increment or decrement
the lock depth within the C code action. If this depth ever
went below 0 or exceeded a small constant, the extension
would report an incorrect lock pairing.
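
A minimal sketch of such a depth-tracking action follows. It assumes the per-instance data value is a structure holding a single counter; struct lock_data, lock_depth_action, and the bound MAX_LOCK_DEPTH are hypothetical names chosen for this example, and the code does not use xgcc's actual internal interface.

```c
/* Assumed bound standing in for the "small constant" on nesting depth. */
#define MAX_LOCK_DEPTH 8

/* Hypothetical data value attached to each instance of l. */
struct lock_data {
    int depth;   /* current nesting depth of the lock */
};

enum lock_op { OP_LOCK, OP_UNLOCK };

/*
 * Body of the C code action: adjust the depth for one lock or unlock
 * operation and return 1 if the pairing is incorrect, 0 otherwise.
 */
static int lock_depth_action(struct lock_data *d, enum lock_op op)
{
    d->depth += (op == OP_LOCK) ? 1 : -1;
    return d->depth < 0 || d->depth > MAX_LOCK_DEPTH;
}
```

Two lock operations followed by two unlock operations return the depth to 0 without an error; a third unlock drives the depth to -1 and is reported.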

Composition is another mechanism extensions can use
to enhance the SM model. Extensions can be composed such
that each extension uses the results of the previous one in
its own analysis. Extensions implement this composition by
using xgcc's internal interface to annotate the ASTs with
arbitrary data values. Subsequent extensions can retrieve
and use these values. One common use of composition is
the path-kill extension [10], which flags all calls to panic
so that subsequent analyses will not report errors on paths
dominated by these calls. When a subsequent extension sees
a flagged function call, it stops traversing the current path.

4.  METAL  PATTERNS 

Metal patterns provide a simple way for extensions to
identify source actions that are relevant to a particular rule.
Patterns are written in an extended version of the source
language (C) and can specify almost arbitrary language
constructs such as declarations, expressions, and statements.
Patterns are easy to use because they syntactically mirror
the source constructs that they are intended to match.

A base pattern in metal is a bracketed code fragment
written in our augmented version of C. Base patterns can
be composed with the logical connectives && and ||. The
simplest base patterns in metal syntactically match the code
that the extension wishes to recognize. Because we match
ASTs, spaces and other lexical artifacts do not interfere
with matching. For example, the base pattern {rand()}
will match all calls to the rand function.

A simple pattern could not, for example, match all
pointer dereferences because each dereference refers to a
different pointer. The pattern on line 5 in the free checker
matches all dereferences with a metal hole variable. Any
metal variable declared with the keyword decl is a hole
variable. Hole variables let patterns contain positions where any
source construct of the appropriate type will match.

Hole variables in metal must be typed. If a hole variable
is assigned a C type, the hole can be "filled" by any
expression of that type. To match all pointer dereferences in the
free checker, though, we cannot assign v any single C type.
Metal introduces new meta types that broaden holes to an
entire class of related types. The hole variable v is declared
with the meta type any_pointer, which matches pointers to
storage of any type. Table 1 lists the hole types and their
meanings.

Hole Type        Matches
Any C type       any expression of that type
any_expr         any legal expression
any_scalar       any scalar value (int, float, etc.)
any_pointer      any pointer of any type
any_arguments    any argument list
any_fn_call      any function call

Table 1: Hole types and their meanings.

If the same hole variable appears multiple times in a
pattern, each appearance must contain equivalent ASTs. For
example, the pattern {foo(x,x)} matches calls of the form
foo(0,0) and foo(a[i],a[i]), but not foo(0,1).
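
The equivalence requirement on repeated holes amounts to a structural comparison of the two subtrees that fill the hole. The sketch below assumes a deliberately tiny, hypothetical AST (a label and at most two children); struct ast, ast_equal, and match_foo_x_x are names invented for this example and say nothing about how xgcc represents trees.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical minimal AST node: a label and up to two children. */
struct ast {
    const char *label;          /* e.g. "0", "a", "i", or "[]" for indexing */
    const struct ast *kid[2];   /* NULL where a child is absent */
};

/* Structural equality: same label and pairwise equal children. */
static int ast_equal(const struct ast *a, const struct ast *b)
{
    if (a == NULL || b == NULL)
        return a == b;
    if (strcmp(a->label, b->label) != 0)
        return 0;
    return ast_equal(a->kid[0], b->kid[0]) &&
           ast_equal(a->kid[1], b->kid[1]);
}

/*
 * Matching {foo(x,x)} against the two arguments of a call to foo: the
 * hole x is filled by the first argument, so the second must be an
 * equivalent tree.
 */
static int match_foo_x_x(const struct ast *arg0, const struct ast *arg1)
{
    return ast_equal(arg0, arg1);
}
```

With this definition, two separately built trees for a[i] compare equal, so foo(a[i],a[i]) matches, while the trees for 0 and 1 differ in their labels, so foo(0,1) does not.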

A  hole  variable  used  within  an  action  (as  opposed  to 
a  pattern)  refers  to  the  AST  node  that  matches  the  hole. 
Thus,  the  use  of  v  on  line  8  in  the  free  checker  refers  to  the 
AST  for  the  freed  pointer  matched  on  line  7. 

Callouts let programmers extend the matching language
to express unanticipated or linguistically awkward features
by writing boolean expressions in C code that determine
whether a match occurs. Callouts are identified syntactically
by prefixing a base pattern with $.

The degenerate callouts, ${0} and ${1}, match nothing
and everything respectively. Callouts are most often used as
a conjunct that refines a more general pattern. For example,

    { fn(args) } && ${ mc_is_call_to(fn, "gets") }

refines a pattern that matches all function calls to one that
only matches calls to gets. The variable fn is a hole variable
of type any_fn_call, and the variable args is a hole with
type any_arguments. This pattern could have been written
as literal C code as well.

Used alone, callout functions can only refer to the
current program point, mc_stmt, and any global state either
within the extension or within xgcc. Used as a conjunct or
disjunct with other patterns, the callout can refer to the
hole variables used in these patterns as arguments (see fn
in the example above). xgcc provides an extensive library of
functions useful as callouts.

Legal  patterns  can  specify  any  C  expression  or  statement 
(including  loops,  conditionals,  or  switch  statements)  with 
two  restrictions.  First,  all  identifiers  in  the  pattern  must  be 
either  hole  variables  defined  in  the  extension  or  legal  names 
in  the  scope  of  the  code  base  being  checked.  Second,  the 
C  constructs  used  in  the  pattern  must  compile  in  isolation. 
Example  illegal  patterns  include  a  single  case  arm  without 
any  enclosing  switch  statement;  an  isolated  break;  etc.  All 
of  these  constructs  can  be  matched  with  a  callout. 

5.  INTRAPROCEDURAL  ANALYSIS 

This section describes our intraprocedural algorithm
that applies metal extensions to a source base. The goal
of this algorithm is to execute checkers efficiently without
compromising metal's flexibility.

Extensions are applied to each AST in a single path in
execution order. Execution order means that the tree for
each individual statement is visited in the order that the
corresponding instructions would execute. For example, a
function call's arguments are visited before the call; an
assignment's right-hand side is visited first, then the left-hand
side, then the assignment. We refer to AST nodes as
program points. At each program point, the extension decides
whether to execute any transitions and which transitions to
execute.

We  implement  this  traversal  with  a  simple  depth-first 
search  (DFS)  of  the  CFG  starting  at  the  entry  block.  Thus, 
the  algorithm  follows  a  single  control  path,  traversing  each 
block  along  this  path  until  the  end  of  the  function,  then 
backtracks  to  the  last  branch  point.  The  DFS  portion  of 
the  analysis  is  straightforward;  the  important  feature  of  the 
analysis  is  the  use  of  block-level  state  caching  for  speed.  The 
algorithm  records  the  extension  state  in  each  basic  block 
before  traversing  that  block.  At  a  subsequent  traversal  of 
the  same  block,  the  traversal  is  aborted  and  the  analysis 
backtracks  to  the  last  branch  point  if  the  extension  state  is 
contained  within  this  cache. 
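
The interaction between the depth-first traversal and the block-level cache can be sketched in a few lines of C. This is a toy model under strong assumptions, not the algorithm of Figure 4: the whole extension state is collapsed into one small integer in the range 0 to 31, each block's effect is a made-up transfer function that ORs in some bits, and the names struct block, struct analysis, and dfs are hypothetical.

```c
#define MAX_BLOCKS 8

/* Toy basic block: up to two successors and a toy transfer function. */
struct block {
    int succ[2];    /* successor block indices, -1 where there is none */
    int set_bits;   /* applying the block computes state | set_bits */
};

struct analysis {
    const struct block *cfg;
    unsigned cache[MAX_BLOCKS];   /* bit s set: block was entered in state s */
    int visits;                   /* number of block traversals performed */
};

/* Depth-first traversal of the CFG with block-level state caching. */
static void dfs(struct analysis *an, int b, int state)
{
    if (b < 0)
        return;                              /* end of path: backtrack */
    if (an->cache[b] & (1u << state))
        return;                              /* cache hit: abort this traversal */
    an->cache[b] |= 1u << state;             /* record the state before the block */
    an->visits++;

    int out = state | an->cfg[b].set_bits;   /* apply the block */
    dfs(an, an->cfg[b].succ[0], out);
    dfs(an, an->cfg[b].succ[1], out);
}
```

On a diamond-shaped CFG in which neither arm changes the state, the join block is traversed once, because the second arrival hits the cache. If one arm changes the state, the join block is traversed twice, once per distinct incoming state. Since the number of states is finite in this model, the cache also guarantees termination on loops.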

We  first  describe  how  to  execute  an  extension  at  a  single 
program  point.  We  then  describe  caching  at  the  block  level. 
Finally,  we  outline  the  pseudocode  for  the  DFS  algorithm. 

5.1  Applying  an  extension  to  a  program  point 

Figure 4 shows a simplified version of the DFS
algorithm. We describe the data structures below.

Each variable-specific instance (var_state) consists of
an integer holding a state value, a tree for the program
object to which the state is attached, and an extension-defined
data value of arbitrary size. The tree in the var field can be
any tree in the code (e.g., an l-value, a general expression,
a statement).

An extension's state is represented by an sm_instance
structure, which has three main components: (1) the
extension's single global state, gstate, (2) a list of all
variable-specific instances, active_vars, and (3) a pointer to the
extension code, sm_fn. Modifications to both gstate and
active_vars are private to each path: mutations revert
when the extension backtracks.

The extension code performs the following functions: (1) it
determines which transitions to execute and (2) it executes these
transitions. Together, these two steps specify the transfer functions
for the analysis. When a transition does execute, it can have one of
the following effects on the sm_instance structure: (1) it can alter
gstate, (2) it can add or remove elements from active_vars, (3) it
can alter the state and/or data value of a member of active_vars, or
(4) it can leave the sm_instance unchanged.

To make the analysis algorithm efficient, we exploit the fact that if
the extension is deterministic, applying the extension to the same
program point in the same state will always produce the same result.
Thus, we only need to apply the extension to each program point once
in each state. More precisely, the determinism condition that we
require says that given a single state tuple and a program point, if
we set the extension’s state to that tuple and apply the sm_fn
function to the program point, it will always produce the same
transformations to the sm_instance structure. In addition, we require
that each state tuple is a logically separate state machine. We
revisit the latter condition below.

5.2  Caching 

From xgcc’s perspective, the state of an extension is viewed as a set
of state tuples represented as pairs, (gstate, v), where gstate is
the extension’s global instance and v is either a state variable
instance from active_vars or the distinguished placeholder “<>”. The
placeholder ensures that when the analysis begins, the extension state


  // instance of a state variable
  struct var_state {
      AST var;     // AST for var
      int s;       // state of var
      ANY data;    // extension-specific data
  };

  // a summary edge.
  struct edge {
      struct point {
          int gstate;     // global state
          var_state v;    // state var instance
      } start, end;
  };

  struct block {
      block succs[];    // successors; includes backedges
      AST trees[];      // block's trees in execution order
      set edge blk_add;
      set edge blk_transition;
*     set edge sfx_add;
*     set edge sfx_transition;
  };

  struct sm_instance {
      int gstate;                   // global state
      set var_state active_vars;    // instances
      sm_fn(sm_instance, AST);      // SM function
  };

  // Build set of all vars not in block summary
  set cache_misses(sm, b) {
      s = {};
      foreach v in sm.active_vars
          if ((s1, s2) in b.blk_transition
                  where s1 == (sm.gstate, v))
              s U= v;
      return sm.active_vars - s;
  }

  // DFS traversal
  void traverse_cfg(sm, backtrace, caller, b) {
      push(backtrace, b);
      sm.active_vars = cache_misses(sm, b);

      // prune path if visited block in current state before
      if (sm.active_vars == {}) {
*         relax(backtrace);
          return;
      }

      sm' = copy(sm);

      // apply extension function to each AST node in block
      foreach tree t in b.trees {
          sm->sm_fn(sm, t);
*         if (t is function call) {
*             // t is last tree in b; b has exactly one succ
*             follow_call(sm, backtrace, caller, t, b->succs);
*             return;
*         }
      }

      // compute add and transition edges
      foreach v in sm.active_vars {
          e = (sm.gstate, v);
          // if v was active at block entry, create a
          // transition edge.
          if v' in sm'.active_vars where v.tree == v'.tree
              b.blk_transition U= ((sm'.gstate, v'), e);
          // otherwise v was created by b: create an add edge.
          else {
              v' = (v.tree, unknown, nil);
              b.blk_add U= ((sm'.gstate, v'), e);
          }
      }

      if is_exit_block(b)
*         relax(backtrace);
      else
          // apply successor blocks to copy of current sm
          foreach s in b->succs
              traverse_cfg(copy(sm), copy(backtrace), caller, s);
  }

Figure 4: Depth-first CFG traversal. Lines marked with a * are only
relevant to the interprocedural case.


contains exactly one state tuple. For example, the initial state of
the free checker would be represented by the set {(start, <>)}, and,
after the first free at line 15, would change to
{(start, <>), (start, v : p ↦ freed)}.

As we described in Section 3, an extension’s state is represented as
a set of state tuples. Each basic block, b, contains a block summary
that records the union of all extension states that reach that block
and also records how the SM corresponding to each tuple is
transitioned during the analysis of that block. Basic blocks are
xgcc’s internal representation of the CFG for a function. The
transitions caused by the basic block are visible to xgcc through
modifications to the current sm_instance structure. We divide the
potential ways an sm_instance can change while traversing a single
block into two categories: (1) transitions that change the value of
either the global instance or a variable-specific instance and (2)
additions that create a new variable-specific instance. The summary
for a block, b, represents these effects using two types of directed
edges:

1. Transition edges: (s, v : t ↦ v_s) → (s', v : t ↦ v_s').
The initial state tuple specifies that at the entry to b, the global
instance had the value s and there was an instance of state variable
v with value v_s attached to the program object t. The final state
tuple specifies that, during the analysis of the block, the SM
corresponding to the initial state tuple transitioned to the state
where the global instance has value s' and the variable-specific
instance for t has value v_s'. Each state tuple that reaches a block
generates exactly one transition edge, where the transition can be
the identity.

Figure 5 shows the CFG for the example in Figure 2. The first row in
each block in the figure shows the block summary. An example
transition edge from the block summary in block 7 is:
(start, v : p ↦ freed) → (start, v : p ↦ stop). This says that the
free checker enters block 7 in the global state start with an
instance for p in the freed state and p is transitioned to the stop
state (killed) during the analysis of block 7.

2. Add edges: (s, v : t ↦ unknown) → (s', v : t ↦ v_s').
The add edge says that when the global instance has initial value s,
a new instance of v that attaches state v_s' to t is created while
traversing the block. The start tuple for an add edge contains the
special value v : t ↦ unknown because the edge only applies when we
know nothing about t at the entry to b.

An example add edge for block 2 in Figure 5 would be:
(start, v : p ↦ unknown) → (start, v : p ↦ freed).
At block 2’s entry, the global instance has the value start. At its
exit, the global instance still has the value start, but the variable
p now has attached state freed. The need for the special value in the
start tuple is clear if we consider that if we knew that p was freed
at the entry to block 2, we would report a double-free error instead
of transitioning p to the freed state.

The block summary is the union of all add and transition edges
produced by that block. Before applying the extension to a block, the
analysis converts the current extension state to a set, s, of state
tuples. It then removes any tuple, e, from s that is equivalent to
the initial state tuple for some transition edge. After this process,
if s is empty, the traversal of the current path is aborted. After a
block is traversed, the transition edges for each e ∈ s and the add
edges are both added to the summary.
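
The cache check described above can be sketched in Python as follows. This is a simplified illustration and not xgcc's code: state tuples are modeled as plain Python tuples, and the type aliases `StateTuple` and `Edge` are assumptions made for the example. Only the name `cache_misses` is taken from Figure 4.

```python
from typing import Set, Tuple

# A state tuple is (gstate, instance); an instance is (object, state)
# or the placeholder "<>". An edge is (start_tuple, end_tuple).
StateTuple = Tuple[str, object]
Edge = Tuple[StateTuple, StateTuple]


def cache_misses(state: Set[StateTuple], blk_transition: Set[Edge]) -> Set[StateTuple]:
    """Keep only the tuples that do not start a recorded transition edge."""
    seen = {start for (start, _end) in blk_transition}
    return {t for t in state if t not in seen}


summary = {(("start", ("p", "freed")), ("start", ("p", "stop")))}

# q has not reached this block in the freed state yet, so it is a miss.
state = {("start", ("p", "freed")), ("start", ("q", "freed"))}
misses = cache_misses(state, summary)

# Every tuple is already cached: the path would be aborted here.
pruned = cache_misses({("start", ("p", "freed"))}, summary)
```

In the first call only the tuple for q survives, so the block is re-analyzed for q alone; in the second call the result is empty and the traversal would backtrack.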

Note that while the intraprocedural algorithm does not use either the
add edges or the destination tuple of the transition edges, they are
crucial for the interprocedural caching described in the next
section.

Our algorithm computes a fixed point that is similar to the
meet-over-paths solution in a traditional dataflow analysis [16]. The
analysis stops when the block summary (i.e., cache) at each block
contains all state tuples that can reach that block along any control
path (i.e., the maximal fixed-point solution).

The algorithm in this section adds an additional restriction to metal
extensions beyond determinism. The transitions that a
variable-specific instance attached to program object v undergoes at
a program point cannot be affected by the presence, absence, or state
of any other instance attached to object v'. This independence
condition allows us to combine all state tuples that reach a block
into a single set in the block summary because the state tuples
represent independent state machines that could, logically, execute
separately. Without independence, the number of times that we analyze
each program point would grow exponentially with the number of
variable-specific instances. With independence, this number scales
linearly with the number of these instances. Note that transitions on
a variable-specific instance can be coupled to the value of the
global instance.
5.3  DFS  With  Caching  Pseudocode 

An extension, sm, is applied to a procedure, f, by calling the
routine traverse_cfg in Figure 4 with four arguments: sm, which is
initialized to the start state, an empty stack, the caller (relevant
in the interprocedural case), and the entry block to f’s CFG. In the
start state, gstate is initialized to the first state in the
extension text (start for the free checker) and the active_vars set
contains one element in the special <> state so that the extension’s
state consists of exactly one state tuple. This element persists
throughout the analysis, but it is ignored whenever active_vars is
nonempty. Thus, we omit it from the block summaries in Figure 5 that
contain at least one other element.

The routine traverse_cfg implements the depth-first search with
caching. This routine is mostly a standard recursive DFS except that
at the entry to each new basic block, b, it calls the function
cache_misses to determine if the current extension state is a subset
of the block summary as discussed above. cache_misses returns an
updated active_vars set such that the sm_instance with the new set
will only contain state tuples that were not in the block summary. If
all of the tuples in the current sm_instance are in the block
summary, the DFS backtracks to the last branch point. If not,
traverse_cfg applies the extension code to every tree in b in
execution order and then traverses b’s successors. Successors are
applied to a copy of the current extension state.
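
The traversal just described can be sketched as a small, self-contained Python function. This is a simplified model under stated assumptions, not the xgcc implementation: each block's effect is collapsed into a single hypothetical `transfer` function over a set of state tuples, the cache stores reached tuples rather than edges, and the interprocedural lines of Figure 4 are omitted.

```python
from typing import Callable, Dict, FrozenSet, List, Set, Tuple

StateTuple = Tuple[str, str]            # (gstate, instance), simplified
State = FrozenSet[StateTuple]


def traverse(cfg: Dict[str, List[str]],
             transfer: Dict[str, Callable[[State], State]],
             entry: str,
             start: State) -> Tuple[Dict[str, Set[StateTuple]], List[str]]:
    cache: Dict[str, Set[StateTuple]] = {b: set() for b in cfg}
    visits: List[str] = []              # blocks actually analyzed, in order

    def dfs(block: str, state: State) -> None:
        # Keep only tuples that have not reached this block before.
        misses = frozenset(t for t in state if t not in cache[block])
        if not misses:
            return                      # cache hit: backtrack
        cache[block] |= misses          # record state before traversing
        visits.append(block)
        out = transfer[block](misses)
        for succ in cfg[block]:
            dfs(succ, out)              # frozenset: each path has its own state

    dfs(entry, start)
    return cache, visits


identity = lambda s: s
# A diamond with a back edge from "a" to "entry".
cfg = {"entry": ["a", "b"], "a": ["entry", "exit"], "b": ["exit"], "exit": []}
transfer = {name: identity for name in cfg}
cache, visits = traverse(cfg, transfer, "entry", frozenset({("start", "<>")}))
```

In this example the back edge to `entry` and the second arrival at `exit` are both pruned, so each block is analyzed exactly once and the loop does not prevent termination.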

6.  INTERPROCEDURAL  ANALYSIS 

This section describes our context-sensitive, interprocedural
analysis. At a high level, it works as follows:

1. The first preprocessing pass compiles each file in isolation,
emitting ASTs to a temporary file. These emitted files include all
type declarations, variable declarations, and code within the source
file and are typically four or five times larger than the text
representation.

2.  The  second  analysis  pass  reads  these  temporary  files, 
reassembles  their  ASTs,  and  constructs  the  CFG  and 
call  graph.  Functions  with  no  callers  are  considered 
roots.  When  computing  roots,  recursive  call  chains 
are  broken  arbitrarily. 

3.  The  system  applies  each  extension  to  the  CFG  with 
a  DFS  traversal  starting  at  each  callgraph  root.  On 
each  function  call,  the  system  retrieves  the  CFG  for  the 
callee  and  restarts  the  traversal  there.  The  extension 
state  is  refined  at  the  call  boundary  and  restored  at 
the  return.  The  rules  for  refine  and  restore  follow  C 
scoping  rules  unless  the  extension  specifies  otherwise. 

By  default,  if  the  function’s  CFG  is  not  available,  the 
system  silently  continues  to  the  next  CFG  node. 
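
Root selection in step 2 can be sketched as follows. The paper states only that recursive call chains are broken arbitrarily; the alphabetical tie-break used here, the `find_roots` name, and the dictionary encoding of the call graph are assumptions made for illustration.

```python
from typing import Dict, List, Set


def find_roots(calls: Dict[str, Set[str]]) -> List[str]:
    """calls maps each function with an available CFG to the functions it calls."""
    callers: Dict[str, Set[str]] = {f: set() for f in calls}
    for f, callees in calls.items():
        for g in callees:
            if g in callers and g != f:     # ignore direct self-recursion
                callers[g].add(f)

    reached: Set[str] = set()

    def mark(f: str) -> None:
        stack = [f]
        while stack:
            g = stack.pop()
            if g not in reached:
                reached.add(g)
                stack.extend(h for h in calls[g] if h in calls)

    # Functions with no callers are roots.
    roots = sorted(f for f in calls if not callers[f])
    for r in roots:
        mark(r)
    # Anything still unreached sits on a recursive chain with no outside
    # caller; break the chain by picking one member (here: alphabetically).
    for f in sorted(calls):
        if f not in reached:
            roots.append(f)
            mark(f)
    return roots


calls = {
    "main": {"a"}, "a": {"b"}, "b": set(),
    "ping": {"pong"}, "pong": {"ping"},     # mutually recursive, never called
}
roots = find_roots(calls)
```

Here `main` is a root because nothing calls it, and `ping` is chosen to break the `ping`/`pong` cycle so that both functions are still analyzed.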

To make the DFS algorithm efficient, we add a summary cache to each
function computed by combining the block summaries. This cache is
checked at each function call. Similar to the intraprocedural
caching, if a hit occurs, the call is not followed. Unlike the
intraprocedural case, however, we cannot simply abort the current
path when there is a cache hit at a function boundary. Because there
are many callsites for each function, we may not have analyzed the
code after the call in the current state. Thus, on a cache hit, we
use both add and transition edges to update the sm_instance and the
traversal resumes after the function call. The function summary
memoizes the results of the state transformation defined by each
function.

Our algorithm does not require that the extension has a finite state
space, or that the state space is even known when the analysis
begins. The algorithm that we describe here is inspired by the
dynamic programming algorithm in [18], but the algorithm in [18]
requires that the state space of the analysis is finite. The
resulting practical difference is that our algorithm executes metal
extensions top-down. Thus, rather than analyzing each function
starting from all possible states, we only analyze each function
starting in the states that can reach that function along an
interprocedurally valid path (i.e., an interprocedural path that
respects call and return sites).

6.1  Refine  and  Restore 

State refinement occurs when a function call is encountered and that
function call is followed. The state is restored when the analysis
returns from the callee and resumes analyzing the caller. The
extension’s global instance passes across the function call boundary
unchanged.

When the call is followed, any object that passes from the caller’s
scope to the callee’s scope should retain its state. This operation
often requires moving the state from an object in the caller’s scope
to the corresponding object in the callee’s scope. When the call
returns, the restore operation may need to move the state back from
an object in the callee’s scope to the appropriate object in the
caller’s scope and, potentially, restore the original state in the
caller. In addition, any variable-specific instances that left scope
when the call was followed should reappear when the call returns.

We  refine  and  restore  the  extension  state  at  a  function 
call  according  to  the  list  of  rules  in  Table  2.  Each  rule 
lists  the  actual  parameter,  the  formal  parameter,  the  object 
whose  state  needs  to  be  transferred,  and  how  this  state  is 


Actual  Formal  State in    Refine rule                            Restore rule
x_a     x_f     x_a         state(x_f) = state(x_a)                state(x_a) = state(x_f) (by reference)
                                                                   or state(x_a) unchanged (by value)
&x_a    x_f     x_a         state(*x_f) = state(x_a)               state(x_a) = state(*x_f)
x_a     x_f     x_a.field   state(x_f.field) = state(x_a.field)    state(x_a.field) = state(x_f.field) (by reference)
                                                                   or state(x_a.field) unchanged (by value)
x_a     x_f     x_a->field  state(x_f->field) = state(x_a->field)  state(x_a->field) = state(x_f->field)
x_a     x_f     *x_a        state(*x_f) = state(*x_a)              state(*x_a) = state(*x_f)

Table 2: Refine and restore semantics for retargeting the analysis
across a function call. The final four rules actually apply at all
levels of indirection (e.g., p is the argument, **p has state). Note
that the extension writer may specify whether or not the actual
parameter should be treated as pass by value or pass by reference.


Figure 5: Supergraph for the example code shown in Figure 2. The top
field in each basic block shows the block summary, the middle field
shows the suffix summary, and the bottom field shows the source code
in the block. Each block’s number is listed in the first field. Note
that none of the suffix summaries record any information about q
because q is a local variable so the analysis would never use these
edges. Edges that start and end in a tuple containing the placeholder
<> are omitted from the cache unless this tuple is the only element
in the cache. Also, the suffix summary intentionally omits edges that
end in a tuple with the value stop. Suffix edges are only relaxed
along traversed paths, i.e., those not suppressed by the algorithm
described in Section 8. The analysis does not follow calls to kfree
because the extension matches these calls. Thus, they are not
considered callsites in the supergraph construction.


refined  to  the  callee  and  then  restored  to  the  caller.  The 
rules  in  the  table  only  cover  the  case  where  the  state  passes 
through  a  function  argument. 

Global variables with attached state are not affected by the refine
and restore operations. File-scope variables will leave scope if the
call is to a different file. One important nuance with file-scope
variables is that they may reenter scope before the callee returns if
the analysis reaches a function further down the call chain that is
in the same file as the original caller. For this reason, file-scope
variables are passed across the function boundary but they are
temporarily inactivated (and, thus, ignored by the analysis) until
the analysis returns to the file in which they were declared. All
state attached to variables and expressions that are local to the
caller is saved at the call boundary, deleted from the sm_instance
before the call is followed, then restored to the sm_instance when
the call returns.
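
The first rule of Table 2, together with the handling of caller-local state, can be sketched in Python as below. This is a minimal model under several assumptions: extension state is a dictionary from object names to state names, the callee has no locals other than its formals, file-scope inactivation is ignored, and a formal whose instance was deleted in the callee also deletes the actual on a by-reference restore. The `refine` and `restore` function names and the `allocated` state are hypothetical.

```python
from typing import Dict, Set, Tuple

State = Dict[str, str]   # program object -> attached state


def refine(caller_state: State, bindings: Dict[str, str],
           caller_locals: Set[str]) -> Tuple[State, State]:
    """bindings maps each formal parameter name to its actual argument name."""
    # Caller-local state is saved and removed; globals pass through.
    saved = {o: s for o, s in caller_state.items() if o in caller_locals}
    callee = {o: s for o, s in caller_state.items() if o not in caller_locals}
    for formal, actual in bindings.items():
        if actual in caller_state:
            callee[formal] = caller_state[actual]      # state(x_f) = state(x_a)
    return callee, saved


def restore(callee_state: State, saved: State, bindings: Dict[str, str],
            by_reference: bool) -> State:
    # Formals leave scope; saved caller-local state reappears.
    caller = {o: s for o, s in callee_state.items() if o not in bindings}
    caller.update(saved)
    if by_reference:
        for formal, actual in bindings.items():
            if formal in callee_state:
                caller[actual] = callee_state[formal]  # state(x_a) = state(x_f)
            else:
                caller.pop(actual, None)               # instance was deleted
    return caller


caller_state = {"p": "freed", "g": "locked", "tmp": "freed"}
callee_state, saved = refine(caller_state, {"x": "p"}, {"p", "tmp"})
callee_state["x"] = "allocated"      # pretend the callee changed x's state
by_ref = restore(callee_state, saved, {"x": "p"}, by_reference=True)
by_val = restore(callee_state, saved, {"x": "p"}, by_reference=False)
```

With pass by reference the caller sees p in the callee's final state for x; with pass by value p keeps the state it had before the call. In both cases the state on `tmp`, which left scope at the call, reappears unchanged.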

6.2  Dynamic  programming  summaries 

This subsection describes how we use block summaries to build
additional summaries at the function level and at the suffix level. A
function summary stores add and transition edges that summarize how
an entire function updates the extension state. Function summaries
are used to reproduce the effects of analyzing a function when a
cache hit occurs at a function call boundary. Each block, b, also has
a suffix summary that consists of add and transition edges starting
at b and ending at the exit point, e_p, of the enclosing function, p.
A function summary can be viewed as a suffix summary beginning at the
entry block s_p. The suffix summary of e_p equals its block summary.

The second row for each block in Figure 5 shows the suffix summary
for that block. For example, the summary in block 10 says that if the
analysis reaches that block in the state (start, v : w ↦ freed), then
the analysis will also reach the exit block in the state
(start, v : w ↦ freed). Thus, the transition edge

(start, v : w ↦ freed) → (start, v : w ↦ freed)

is part of block 10’s suffix summary. Notice that none of the edges
in the suffix summaries end in a tuple containing the stop state.
These edges are unnecessary to the analysis.

Suffix  summaries  are  necessary  because  distinct  SMs  can 
transition  to  the  same  state.  Thus,  an  extension  can  begin 
analyzing  a  function  call  in  a  new  state  so  that  there  is  no 
cache  hit  at  the  function  boundary,  but  still  have  a  cache  hit 
within  the  called  function.  A  common  example  occurs  when 
some  source  variable  v  is  killed  at  a  program  point  p  that 
it  reaches  in  two  different  states.  To  accurately  reflect  the 
effects  of  the  function  call  to  the  caller,  the  analysis  must 
recreate  the  effects  of  fully  analyzing  the  called  function. 
Suffix  summaries  provide  exactly  this  information. 

The relax function, which computes the suffix summaries, is called
whenever the analysis hits the end of an intraprocedural path or the
analysis aborts a path because of a cache hit. Figure 6 gives a
sketch of the edge computation algorithm in the relax function. The
code walks backwards through the list of blocks on the current path,
stored in the backtrace, combining the edges in each block summary
with the suffix edges of the subsequent block in the backtrace. Each
block stores the set of suffix edges in the fields sfx_add and
sfx_transition. Initially, all block and suffix summaries are empty.

More  specifically,  the  code  first  checks  if  the  current 


// Propagate addition and transition edges up path.
relax(backtrace) {
    b = pop(backtrace);

    // Initialize suffix edges.
    if (is_exit_block(b)) {
        b.sfx_add U= b.blk_add;
        b.sfx_transition U= b.blk_transition;
    }

    foreach prev in backtrace {
        // All add edges propagate backwards
        foreach e in b.sfx_add
            // Relabel gstate component of add state tuple
            foreach s in prev.blk_transition where
                    s.end.gstate == e.start.gstate {
                e' = e;
                e'.start.gstate = s.start.gstate;
                prev.sfx_add U= e';
            }

        // Transition edges can descend from both edge types
        foreach e in b.sfx_transition {
            foreach s in prev.blk_transition where
                    s.end == e.start
                prev.sfx_transition U= (s.start, e.end);
            foreach s in prev.blk_add where s.end == e.start
                prev.sfx_add U= (s.start, e.end);
        }

        b = prev;
    }
}

Figure 6: Pseudocode for the summary computation


block, b, is an exit block. If so, it adds b’s block summary to its
suffix summary. It then propagates both add and transition edges in
b’s suffix summary backwards to the previous block’s (prev’s) suffix
summary. This backwards propagation uses the block summary to extend
the length of all of the suffix edges in b by one block. It does so
by creating new edges from the start point of a block summary edge
and the endpoint of a suffix summary edge and adding these extended
edges to prev’s suffix summary.

For a suffix add edge, e_a, in b, the algorithm looks for an edge in
prev’s block summary whose end point matches the start of e_a. Recall
that if e_a adds an instance attached to the program object p, the
start tuple of e_a will contain the special value v : p ↦ unknown.
Each block summary records how that block updates the global instance
with an edge whose endpoints are state tuples that only include the
global instance and the placeholder <>. For the purposes of
relaxation, these special transition edges will match the initial
state of an add edge if the values of the global instance match. For
a suffix transition edge, e_t, the algorithm looks for an add edge or
transition edge in prev’s block summary whose end tuple is equivalent
to e_t’s start tuple. The algorithm stops when it either finishes
walking over the backtrace or when no new edges are propagated (i.e.,
the previous block’s summary does not grow).
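
The relaxation step can be made concrete with the following Python sketch, which mirrors Figure 6 under simplifying assumptions. Edges are modeled as pairs of `(gstate, instance)` tuples, the `Block` class is a hypothetical stand-in for xgcc's block structure, and the backtrace is read without being popped. The early exit implements the stopping condition described above.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple

Point = Tuple[str, object]   # (gstate, instance); instance is (object, state) or "<>"
Edge = Tuple[Point, Point]   # (start, end)


@dataclass
class Block:
    name: str
    is_exit: bool = False
    blk_add: Set[Edge] = field(default_factory=set)
    blk_transition: Set[Edge] = field(default_factory=set)
    sfx_add: Set[Edge] = field(default_factory=set)
    sfx_transition: Set[Edge] = field(default_factory=set)


def relax(backtrace: List[Block]) -> None:
    b = backtrace[-1]
    if b.is_exit:                           # initialize suffix edges
        b.sfx_add |= b.blk_add
        b.sfx_transition |= b.blk_transition
    for prev in reversed(backtrace[:-1]):
        before = len(prev.sfx_add) + len(prev.sfx_transition)
        # Add edges propagate backwards with a relabeled global state.
        for e_start, e_end in list(b.sfx_add):
            for s_start, s_end in prev.blk_transition:
                if s_end[0] == e_start[0]:
                    prev.sfx_add.add(((s_start[0], e_start[1]), e_end))
        # Transition edges can extend either kind of block edge.
        for e_start, e_end in list(b.sfx_transition):
            for s_start, s_end in prev.blk_transition:
                if s_end == e_start:
                    prev.sfx_transition.add((s_start, e_end))
            for s_start, s_end in prev.blk_add:
                if s_end == e_start:
                    prev.sfx_add.add((s_start, e_end))
        if len(prev.sfx_add) + len(prev.sfx_transition) == before:
            break                           # nothing new propagated
        b = prev


PH = "<>"
free_p = Block("frees p")
free_p.blk_add = {(("start", ("p", "unknown")), ("start", ("p", "freed")))}
free_p.blk_transition = {(("start", PH), ("start", PH))}
exit_b = Block("exit", is_exit=True)
exit_b.blk_transition = {
    (("start", ("p", "freed")), ("start", ("p", "freed"))),
    (("start", PH), ("start", PH)),
}
relax([free_p, exit_b])
```

After relaxation, the first block's suffix summary records that a path starting there with nothing known about p reaches the function exit with p in the freed state, which is the information a caller needs on a later cache hit.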

The input to our algorithm is the supergraph for the source base,
which is defined in [18]. The supergraph is constructed from the CFG
for every function in the source base with the following
modifications. First, the algorithm adds two nodes to each routine p:
an entry node, s_p, and an exit node, e_p. Second, it splits calls to
p into two nodes: a callsite node, c_p, and a return-site node, r_p.
Finally, it adds two directed edges: one from c_p to s_p, the other
from e_p back to r_p. The supergraph ensures that the only
intraprocedural successor of c_p is r_p.
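
The supergraph construction can be sketched as follows. The encoding is an assumption made for illustration: each function's CFG is a dictionary that already contains nodes named `entry` and `exit`, `call_nodes` maps a CFG node to the function it calls, and the `:call` and `:ret` suffixes are hypothetical labels for the callsite and return-site halves of a split call node.

```python
from typing import Dict, List, Set, Tuple

Node = Tuple[str, str]                 # (function, node name)


def build_supergraph(cfgs: Dict[str, Dict[str, List[str]]],
                     call_nodes: Dict[Node, str]) -> Set[Tuple[Node, Node]]:
    def head(fn: str, node: str) -> Node:
        # Edges into a call node arrive at its callsite half.
        return (fn, node + ":call") if (fn, node) in call_nodes else (fn, node)

    edges: Set[Tuple[Node, Node]] = set()
    for fn, cfg in cfgs.items():
        for node, succs in cfg.items():
            src = (fn, node)
            if (fn, node) in call_nodes:
                callee = call_nodes[(fn, node)]
                c, r = (fn, node + ":call"), (fn, node + ":ret")
                edges.add((c, r))                      # only intraprocedural successor
                edges.add((c, (callee, "entry")))      # c_p -> s_p
                edges.add(((callee, "exit"), r))       # e_p -> r_p
                src = r                                # control continues from r_p
            for s in succs:
                edges.add((src, head(fn, s)))
    return edges


cfgs = {
    "main": {"entry": ["n1"], "n1": ["exit"], "exit": []},
    "f": {"entry": ["exit"], "exit": []},
}
edges = build_supergraph(cfgs, {("main", "n1"): "f"})
```

In the result, the call in `main` is split into `n1:call` and `n1:ret`, with one edge into the entry of `f` and one edge from the exit of `f` back to the return site.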


6.3 The Top-Down Algorithm in Detail

The top-down algorithm traverses the supergraph depth-first starting
at all function roots. As shown in Figure 4, when a function call is
encountered, follow_call is called to restart the traversal at the
entry to the callee. The routine takes the sm_instance, the caller’s
backtrace, the caller’s AST, the callee’s AST, and the return-site
node, and performs the following operations:

1. Refines the extension state to the callee’s scope as described in
Section 6.1.

2. Calls traverse_cfg with the refined sm_instance, an empty
backtrace, the callee’s AST, and the callee’s entry block.

3. Uses the callee’s function summary to compute a set, s, of
transition and add edges that apply to the current extension state.

4. Restores the edges in s to the caller’s context.

5. Creates new sm_instance structures for each disjoint exit state.
The sm_instance can only assign one state value to each instance (in
both gstate and active_vars), and active_vars can only contain one
instance attached to a particular program object. Thus, s is
partitioned into disjoint sets, each of which contains edges whose
global instance has the same value and whose variable-specific
instances are all attached to different program objects. These
partitions are used to construct the new sm_instance structures.

6. Uses the new sm_instance structures to analyze the remainder of
the caller by calling traverse_cfg on each new sm_instance, the
backtrace saved at the callsite, the caller’s AST, and the return
block at the callsite.
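
Step 5 leaves the partitioning strategy open. One possible approach, shown here as a Python sketch, is a greedy pass that places each exit tuple into the first compatible partition. The greedy strategy, the `partition_exit_tuples` name, and the `allocated` and `locked` state names are assumptions made for illustration; only the two constraints (one global value per partition, one instance per program object) come from the text.

```python
from typing import Dict, List, Set, Tuple

ExitTuple = Tuple[str, Tuple[str, str]]     # (gstate, (object, state))


def partition_exit_tuples(tuples: Set[ExitTuple]) -> List[Tuple[str, Dict[str, str]]]:
    """Split exit tuples into groups that each fit in one sm_instance."""
    parts: List[Tuple[str, Dict[str, str]]] = []
    for gstate, (obj, st) in sorted(tuples):    # sorted for a stable result
        for g, instances in parts:
            if g == gstate and obj not in instances:
                instances[obj] = st             # fits an existing partition
                break
        else:
            parts.append((gstate, {obj: st}))   # start a new partition
    return parts


exit_tuples = {
    ("start", ("p", "freed")),
    ("start", ("p", "allocated")),
    ("start", ("q", "freed")),
    ("locked", ("p", "freed")),
}
parts = partition_exit_tuples(exit_tuples)
```

Because p leaves the callee in two different states under the global value `start`, two partitions are needed for that global value, and a third covers the global value `locked`. Each partition would seed one new sm_instance for the remainder of the caller.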

When  a  state  variable  instance  is  transitioned  to  the  sink 
state,  stop,  in  the  callee,  the  instance  should  be  deleted  from 
the  extension  state  when  the  analysis  returns  to  the  caller. 
Any  edges  that  end  with  a  tuple  containing  an  instance  in 
the  stop  state  are  omitted  from  the  function  summary.  Thus, 
steps  4-6  in  the  list  above  will  not  add  the  stopped  variable 
to  the  outgoing  extension  state. 

The analysis will terminate if each SM within an extension reaches a
final state after a finite number of transitions from every state.
The complexity of this algorithm is similar to that in [18]. Note
that our algorithm has an implementation disadvantage relative to the
algorithms in [5, 18] because we may analyze any given function at
several different points in the analysis as we reach a call to that
function in different states. Thus, we cannot free the storage
associated with a function until we are sure that it will not be
analyzed again. For large programs, it may be necessary to create
compact path summaries that only retain those portions of the AST
that are relevant to the analysis; we leave this computation to
future work. This limitation has not, however, prevented our analysis
from running effectively on the Linux kernel.

7.  UNSOUNDNESS 

The strength of our extensions is that they can express many rules in
a concise way; they are not designed to express sound analyses. It is
easy for extensions to make approximations or use analyses that are
not conservative. Because the extensions are not intended to be
sound, building a sound analysis engine is a misdirected effort; the
analysis should instead focus on executing the extensions
effectively.

Metal extensions often introduce unsoundness by making approximations
or by using analysis techniques that produce good results but are not
necessarily correct. For example, using statistical analysis to infer
which routines must be paired (such as lock and unlock) is an
effective technique, but cannot guarantee that these inferences are
correct.

The interprocedural analysis algorithm in xgcc is unsound because it
does not analyze recursive loops conservatively, and it does not
analyze value flow conservatively. When a function cache hit occurs
during a recursive loop, the function summary may not be complete.
The conservative solution is to assume that the extension could be in
any possible state after the cache hit. Instead, our algorithm
assumes that the existing function summary is sufficient.

Our approach is vulnerable to both false negatives and false
positives. False negatives occur when a checker fails to warn about
an error in the program. For certain classes of errors, such as
security holes, false negatives may be a serious problem. However,
even here, the tradeoff between soundness and unsoundness at a
practical level is not clear-cut. Our focus on expressiveness means
that we can easily check many security properties. As a result, to
the best of our knowledge, we are able to find more security holes
than sound analyses [1, 9, 10].

False positives present a different problem: if a checker’s warnings
are often wrong, then a user will ignore all of its warnings. The
next two sections discuss how we counter false positives with a
variety of lightweight suppression techniques and a post-processing
ranking step that tries to order the rule violations that we report
such that the most important, most likely violations appear first.

In an ideal world, we could write effective, sound analyses to check
every program rule that we could think of. Unfortunately, it is well
known that it is infeasible to prove programs correct, so it is
unlikely that we will ever approach this goal. Thus, the ideal
approach is one that is sound when it can be and unsound where the
sound approach fails. Our approach explores the benefits and uses of
unsoundness.

Program rules fall into equivalence classes where a violation
of one rule is no less or more important than a violation
of another. Common classes include the set of all
exploitable security holes or nondeterministic bugs. In such
cases, finding 1000 bugs of a given class is more important
than all 10 violations of a single rule in that class. It is
the observable behavior of the program that actually matters,
not its behavior with respect to any of these properties.
The observable behavior will not be correct until all of the
bugs in the system are fixed. We are essentially making an
end-to-end argument [20]: it makes little sense to expend
significant resources reducing the error rate of one part of a
system below the residual error rate of the other parts. An
unsound analysis that finds more bugs improves the end-to-end
behavior of the system more than a sound analysis that
finds fewer bugs.

8.  FALSE  POSITIVE  SUPPRESSION 

Static analyses can make approximations that lead to incorrect
error reports (false positives). This section describes
our main techniques for false positive suppression.

Killing variables and expressions. Whenever a variable
is defined, xgcc iterates through the list of program
objects with attached state and determines if the defined
variable is used within any of these objects. If so, the object
is transitioned to the stop state, thereby deleting the
corresponding state variable instance. In Figure 2, xgcc
automatically transitions the variable p from the freed state to
the stop state at the assignment, “p = 0,” at line 8. The
assignment case is obvious; the slightly more subtle case is
that an expression (e.g., a[i]) with attached state is
transitioned to the stop state when a component of that expression
(e.g., i) is redefined. This analysis runs transparently unless
a checker requests otherwise, and it is the single most
important technique for suppressing false positives in checkers
that attach state to specific program objects.
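
The kill rule can be illustrated with a small sketch. This is not xgcc code: it assumes tracked objects are stored as expression strings mapped to a state name, and the helper name kill_on_definition is hypothetical. Removing an entry stands in for the transition to the stop state.

```python
import re


def kill_on_definition(tracked, defined_var):
    """Drop every tracked expression that mentions the redefined variable.

    `tracked` maps an expression string (e.g. "p" or "a[i]") to a checker
    state name. Dropping an entry models the transition to the stop state.
    """
    identifier = re.compile(r"[A-Za-z_]\w*")
    return {
        expr: state
        for expr, state in tracked.items()
        if defined_var not in identifier.findall(expr)
    }


tracked = {"p": "freed", "a[i]": "freed", "q": "freed"}
after_p = kill_on_definition(tracked, "p")  # models the assignment "p = 0"
after_i = kill_on_definition(tracked, "i")  # models a redefinition of i
```

Redefining p removes only p, while redefining i removes a[i] because i is a component of that expression.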

Synonyms. If a variable tracked by an extension is assigned
to another variable, both variables become synonyms:
state changes in one are mirrored in the other. For example,
since p and q are equal in the following code fragment,
a successful check that p is not null also implies that q is
not null at the dereference:

    p = q = kmalloc(...);
    if (!p)
        return 0;
    *q;  /* safe dereference: q = p = not null */

We implemented synonyms with a 50-line addition to our
system. In addition to reducing false positives, synonyms
also increase coverage by increasing the number of variables
with an attached state. In Figure 2, the assignment on line
7 allows the analysis to catch the error on line 12.
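
One way to picture synonyms is to let variables share a single state record. The sketch below is an illustration under that assumption, not the 50-line xgcc addition; the class SynonymStates and the state names "unknown" and "not_null" are hypothetical.

```python
class SynonymStates:
    """Track checker state per synonym group rather than per variable."""

    def __init__(self):
        self._group = {}  # variable name -> shared, mutable state record

    def attach(self, var, state):
        self._group[var] = {"state": state}

    def assign(self, dst, src):
        """Model `dst = src`: dst joins src's group if src is tracked."""
        if src in self._group:
            self._group[dst] = self._group[src]
        else:
            self._group.pop(dst, None)  # dst now holds an untracked value

    def transition(self, var, new_state):
        if var in self._group:
            self._group[var]["state"] = new_state

    def state(self, var):
        record = self._group.get(var)
        return record["state"] if record else None


states = SynonymStates()
states.attach("q", "unknown")         # q = kmalloc(...)
states.assign("p", "q")               # p = q
states.transition("p", "not_null")    # fall-through of "if (!p) return 0;"
q_state = states.state("q")           # mirrored from p
```

Because p and q share one record, the null check on p also updates q, which is what makes the later dereference of q safe in the fragment above.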

False path pruning.^ Nonexecutable “false paths”
caused by data dependencies are another source of false
positives. xgcc's simple path-sensitive analysis uses basic
value tracking combined with a congruence closure algorithm
to prune infeasible paths. In Figure 2, because the
conditions on lines 4 and 10 are contradictory, there are only
two executable paths through the function contrived, not
four. xgcc's algorithm will prune the two infeasible paths.
The algorithm executes the following steps:

1. We track all variable assignments and comparisons, either
to constants (e.g., x = 10, x < 100) or to other
variables (e.g., y = x, x < y). For each assignment to
a variable, we assign a new name to that variable so
that different definitions of the variable are not confused.
If we see the statement (x < y), we record that
x < y holds along the true branch and x >= y holds
along the false branch.

2. When we see an expression (e.g., y = x + 1), we try
to evaluate the expression based on what we already
know. If we know that x is 10, then we will assign y
the value 11. If we know nothing about x, we store the
entire expression.

3.  If  we  see  a  loop,  we  set  the  value  of  all  variables  defined 
in  the  loop  to  “unknown”  after  the  loop  body.  This 
step  eliminates  the  need  to  unroll  loops. 

4. We infer which variables must have the same value
through the =, ==, and != operators and place them
into a single equivalence class. Using a congruence closure
algorithm [8], we then derive as many equalities
and non-equalities as possible from the list of tracked
assignments. If an equivalence class contains a constant,
we know the exact value of everything in that
equivalence class. If not, using the tracked inequalities
we can derive relationships between equivalence
classes. For example, if x < y holds, then everything
in x’s equivalence class is smaller than everything in
y’s equivalence class.

5. When the extension reaches a branch in the CFG, we
first check if the branch condition is a comparison between
an expression and a constant and we know the
value of the expression. If so, we evaluate the condition
and prune the false path. If not, we look through the
list of relations between congruence classes. If there is
a relationship that either contradicts or confirms the
branch condition, we prune the true or false path. Otherwise,
we assume both paths are possible.

6. If a path is pruned, we remove all block summary entries
that were inserted while analyzing the pruned
path so that the summaries at each block do not contain
any non-reachable state tuples.

^Note that the algorithm described here was implemented
in a previous version of xgcc. We have not yet ported it to
the current version with interprocedural analysis.

Our algorithm is scalable because it does not track values
or evaluate branches too precisely. The justification for this
choice is that most paths are executable and most data
dependencies are simple. Complex data dependencies are difficult
for programmers to understand, so they avoid them as
bad practice.
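
A much-reduced sketch of the pruning steps may help fix ideas. It is not the xgcc algorithm: it covers only renaming on assignment (step 1), equivalence classes with a known constant (step 4), and branch evaluation against a constant (step 5). It uses a plain union-find rather than congruence closure, and it omits expression evaluation, loops, inequalities between classes, and summary cleanup. The class PathFacts and its methods are hypothetical names.

```python
import operator

OPS = {"==": operator.eq, "!=": operator.ne, "<": operator.lt,
       "<=": operator.le, ">": operator.gt, ">=": operator.ge}


class PathFacts:
    """Facts known along one path (hypothetical helper, not xgcc code)."""

    def __init__(self):
        self.version = {}  # variable -> number of definitions seen so far
        self.parent = {}   # union-find parent pointers over versioned names
        self.const = {}    # class representative -> known constant

    def _name(self, var):
        return f"{var}#{self.version.get(var, 0)}"

    def _fresh(self, var):
        # Step 1: every assignment gives the variable a new name.
        self.version[var] = self.version.get(var, 0) + 1
        return self._name(var)

    def _find(self, name):
        self.parent.setdefault(name, name)
        while self.parent[name] != name:
            name = self.parent[name]
        return name

    def assign_const(self, var, value):
        self.const[self._find(self._fresh(var))] = value

    def assign_var(self, dst, src):
        # Step 4: the new definition of dst joins src's equivalence class.
        src_root = self._find(self._name(src))
        self.parent[self._fresh(dst)] = src_root

    def value(self, var):
        return self.const.get(self._find(self._name(var)))

    def feasible_branches(self, var, op, constant):
        # Step 5: evaluate the condition when the value is known.
        known = self.value(var)
        if known is None:
            return {True, False}           # nothing known: keep both paths
        return {OPS[op](known, constant)}  # prune the contradicted path


facts = PathFacts()
facts.assign_const("x", 10)                       # x = 10
facts.assign_var("y", "x")                        # y = x
taken = facts.feasible_branches("y", "<", 100)    # only the true branch
unknown = facts.feasible_branches("z", "<", 100)  # both branches
facts.assign_const("x", 0)                        # new definition of x
```

The final assignment shows why renaming matters: the second definition of x gets a fresh name, so the fact that y equals 10 is left intact.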

Targeted suppression of false positives. One common
cause of false positives is a conflict between an idiomatic
code sequence and an analysis approximation. Metal makes
it easy for an extension to suppress these system-specific
idioms. In some cases, this conflict is an indication that the
approximation is too coarse and a more thorough analysis is
appropriate; in other cases, it is best to suppress the
problematic sequence directly.

A  conservative  version  of  the  free  checker  that  flags  all 
uses  of  freed  variables  as  errors  is  a  good  example.  The 
false  positives  for  this  checker  came  from  two  sources:  (1) 
passing  a  freed  pointer  to  a  debugging  function  that  prints 
the  pointer,  and  (2)  in  BSD,  passing  the  addresses  of  freed 
variables  to  functions  that  redefine  them.  We  added  eight 
lines  of  code  to  the  checker  to  suppress  both  classes  of  false 
positives. 

History. Initially, we worried that after the errors we
reported were fixed, we would only detect false positives in
newer versions that would require heavyweight techniques
to eliminate. A simple alternative is to just remember false
positives from past versions and suppress them in future
versions. We match error reports across versions by comparing
file name, function name, variable names involved in the
analysis, and the actual error itself as stated by the checker.
These fields are relatively invariant under edits (unlike, for
example, line numbers) and seem to work well in practice.
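
The matching rule can be sketched as a key built from the four fields named above. The Report type, the helper names, and the file and function names in the example are all hypothetical; the line number is stored but deliberately left out of the key.

```python
from typing import NamedTuple


class Report(NamedTuple):
    file: str
    function: str
    variables: tuple
    message: str
    line: int  # kept for display only; not part of the match key


def match_key(report):
    """Fields that tend to survive edits; the line number is excluded."""
    return (report.file, report.function, report.variables, report.message)


def suppress_known(reports, known_false_positives):
    """Drop reports that match a false positive recorded for an older version."""
    known = {match_key(r) for r in known_false_positives}
    return [r for r in reports if match_key(r) not in known]


old_false_positives = [
    Report("fs/example.c", "put_item", ("item",), "use after free", 120),
]
new_reports = [
    Report("fs/example.c", "put_item", ("item",), "use after free", 134),
    Report("fs/example.c", "get_item", ("item",), "use after free", 88),
]
remaining = suppress_known(new_reports, old_false_positives)
```

The first new report is suppressed even though its line number moved from 120 to 134; the second survives because the function name differs.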

9.  RANKING 

Given ten errors, you can inspect all of them. Given
1000 errors, you cannot. An effective bug-finding approach
will report hundreds or thousands of errors in a real system.
The ideal error ranking will rank all true error reports before
false error reports, and it will order the true error reports
according to the severity of each bug. We try to approximate the ideal
ranking by first stratifying errors based on their severity,
then sorting within each class based on both the probability
of the error being a false positive and the difficulty of
inspection. The user can then start with the most important
class, inspect within that class until the false positive rate
is too high or inspection requires too much effort, and skip
to the next class of errors.^

From our experience with Linux and BSD, implementers
almost always fix errors that are difficult to diagnose with
testing first. These include use-after-free errors, missing lock
releases, and security holes. We rank these errors over those
that are easier to diagnose with testing, such as memory
allocation failures.

We also group all errors that are computed from a common
analysis fact into the same class. For example, all
use-after-free errors that involve the same freeing function are
placed in the same class. Such grouping makes it easy to
suppress them all if the analysis is wrong.

Generic  ranking.  By  default,  our  system  sorts  error 
messages  using  the  following  criteria: 

1. Distance. Errors that span hundreds of lines are more
difficult to diagnose than those that span a few. We
rank based on the distance between the statement that
contains the error and the statement where the extension
started checking the property that led to the error.

2. Number of conditionals. The more conditionals an error
spans, the harder it is to diagnose and the more
likely it is to be a false path. Each conditional is
arbitrarily weighted as ten lines of distance.

3. Degree of indirection. We rank errors that use synonyms
below those that do not; the former are more
difficult to inspect. We then sort synonyms based on
the length of the assignment chain.

4.  Local  versus  interprocedural.  Local  errors  can  take 
seconds  to  diagnose,  whereas  interprocedural  errors 
can  take  minutes.  We  rank  all  local  errors  over  global 
ones  and  then  order  global  errors  based  on  the  length 
of  the  shortest  call  chain  that  causes  the  error. 

The  latter  two  criteria  partition  error  messages  into  different 
classes,  which  are  then  sorted  using  the  first  two  criteria. 
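
The four criteria can be combined into a single sort key, as in the sketch below. The ErrorReport type and its field names are hypothetical. The text does not say how the two partitioning criteria are ordered relative to each other; placing local-versus-interprocedural ahead of degree of indirection is an assumption made here for illustration.

```python
from dataclasses import dataclass

CONDITIONAL_WEIGHT = 10  # one conditional counts as ten lines of distance


@dataclass
class ErrorReport:
    name: str
    distance: int           # lines between the start of checking and the error
    conditionals: int       # conditionals spanned by the error
    synonym_chain: int = 0  # 0 means no synonyms were involved
    call_chain: int = 0     # 0 means the error is local


def rank_key(report):
    # Criteria 3 and 4 pick the class; criteria 1 and 2 sort within it.
    # Assumption: call-chain length is compared before synonym-chain length.
    error_class = (report.call_chain, report.synonym_chain)
    within_class = report.distance + CONDITIONAL_WEIGHT * report.conditionals
    return (error_class, within_class)


def rank(reports):
    """Return reports with the easiest-to-inspect errors first."""
    return sorted(reports, key=rank_key)


reports = [
    ErrorReport("d", distance=1, conditionals=0, call_chain=2),
    ErrorReport("c", distance=2, conditionals=0, synonym_chain=1),
    ErrorReport("a", distance=5, conditionals=3),
    ErrorReport("b", distance=30, conditionals=0),
]
ordered = [r.name for r in rank(reports)]
```

Report a spans only 5 lines but crosses 3 conditionals, giving it a weighted distance of 35, so it ranks just below b (30). The synonym and interprocedural reports fall into later classes despite their short distances.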

In dealing with Linux and OpenBSD implementers, we
have observed a curious phenomenon: given errors of equal
importance, the more analysis required to find an error, the
lower the error should be ranked. As the number of analysis
steps increases, the likelihood that an analysis approximation
made a mistake and the manual inspection effort both
increase. Thus, these error reports are more likely to be
false positives and more difficult to diagnose.

Checker-specific  and  system-specific  ranking. 
The  domain  knowledge  that  allows  an  extension  to  check  a 
rule  also  helps  it  to  rank  errors  more  effectively  by  gathering 
checker-specific  or  system-specific  information.  We  mostly 
use  checker-specific  ranking  to  (1)  rank  errors  by  severity 
and  (2)  perform  targeted  demotion  of  errors. 

^From informal discussions with the PREfix implementers,
this strategy and many of the ranking rules in this section
have similarities to those that they use.


Many extensions are composed with a simple extension
that annotates paths that can be triggered by the user
(using the string SECURITY) and paths that are likely to be error
paths (using the string ERROR). Errors on the first type of
path pose security risks, since they can be triggered by the
user. Errors on the second are empirically more likely to be
real errors, in part because error paths are less tested. The
extension can also add these two annotations and the
additional annotation MINOR manually. Errors annotated with
SECURITY are ranked highest, those annotated with ERROR
are ranked next, and those annotated with MINOR are ranked
last.
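
A minimal sketch of this stratification follows. The annotation strings come from the text; everything else is assumed. In particular, the text does not say where unannotated errors fall, so placing them between ERROR and MINOR is a guess made for illustration.

```python
# Lower numbers rank higher. The position of unannotated errors (None)
# is an assumption; the text only orders SECURITY, ERROR, and MINOR.
SEVERITY_ORDER = {"SECURITY": 0, "ERROR": 1, None: 2, "MINOR": 3}


def stratify(reports):
    """Stable sort of (annotation, message) pairs by annotation severity."""
    return sorted(reports, key=lambda report: SEVERITY_ORDER[report[0]])


reports = [
    ("MINOR", "m"),
    (None, "u"),
    ("ERROR", "e1"),
    ("SECURITY", "s"),
    ("ERROR", "e2"),
]
messages = [message for _, message in stratify(reports)]
```

Because Python's sort is stable, errors within one stratum keep whatever order an earlier ranking pass gave them.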

Statistical  ranking.  Our  most  novel  ranking  method 
uses  statistical  analysis.  We  have  observed  that  an  analysis 
mistake  often  leads  to  a  local  explosion  of  error  reports.  The 
most  reliable  rules  are  followed  many  times  and  violated 
rarely.  We  can  use  statistical  analysis  to  sort  errors  based 
on  these  numbers. 

An earlier version of the free checker used a flow-insensitive,
interprocedural analysis to compute a list of all
functions that freed their arguments or passed an argument
to a function that did. It would then run a local pass that
used this list to find errors. The checker had an enormous
number of false positives, most due to a single limitation
of our analysis: a small number of functions only freed one
argument based on the value of another argument, but our
analysis decided that these functions always freed their
argument. Thus, rather than having an error rate of one error
per few hundred callsites, these functions had rates closer
to fifty errors per hundred callsites. When we sorted errors
based on these rates, all of the real errors went to the top
and the errors caused by functions the analysis could not
handle were pushed to the bottom.

We rank errors based on the reliability of the rules that
caused them using the z-statistic for proportions. The
z-statistic evaluates the hypothesis that an outcome that
occurs e times out of n is consistent with an expected
probability, $p_0$, for that outcome. We compute the z-statistic
as

$$z(n, e) = \frac{e/n - p_0}{\sqrt{p_0 (1 - p_0) / n}}$$

Our null hypothesis is that a rule is obeyed or violated
at random. In this case, we expect half of all checks to be
successful and half of all checks to fail, hence $p_0 = 0.5$. If
a rule is obeyed at random, that rule is probably incorrect.
Conversely, if a rule is almost always followed, that rule is
probably correct.

We  count  the  number  of  times  the  rule  was  followed 
(or  examples)  as  e  and  the  number  of  rule  violations  (or 
counterexamples)  as  c.  The  total  number  of  events,  n,  is 
the  sum  of  e  and  c. 

The  larger  the  computed  value  of  the  z-statistic,  the 
higher  the  significance  level  at  which  we  can  reject  the  null 
hypothesis.  High  values  indicate  a  higher  probability  that 
the  counterexamples  found  are  indeed  violations  of  a  valid 
rule,  and  are,  therefore,  most  likely  errors. 

For  the  free  checker  above,  each  freeing  function  defines 
its  own  rule.  That  rule  is  violated  when  an  error  is  reported 
on  a  pointer  passed  to  that  function  (c).  The  rule  is  followed 
when  a  pointer  passed  to  that  function  is  never  touched 
again  (e). 
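
The computation is small enough to show directly. The counts below are illustrative, chosen to mirror the rates quoted earlier (about one error per few hundred callsites versus about fifty per hundred), and the rule names are hypothetical.

```python
import math


def z_statistic(examples, counterexamples, p0=0.5):
    """z-statistic for proportions with e examples and c counterexamples."""
    n = examples + counterexamples
    if n == 0:
        raise ValueError("rule was never checked")
    return (examples / n - p0) / math.sqrt(p0 * (1.0 - p0) / n)


# rule name -> (times followed e, times violated c); illustrative counts
rules = {
    "always_frees_argument": (299, 1),
    "conditionally_frees_argument": (50, 50),
}
ranked = sorted(rules, key=lambda name: z_statistic(*rules[name]), reverse=True)
z_reliable = z_statistic(299, 1)
z_random = z_statistic(50, 50)
```

With these counts the reliable rule scores roughly 17.2 and the rule that is violated half the time scores 0, so the single violation of the reliable rule is reported first.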

Ranking code. If a particular block of code causes
an explosion of errors, the analysis probably cannot handle
some aspect of that code. We first applied this observation
to an intraprocedural lock checker that flagged when calls
to a locking function did not have a matching unlock. The
major source of false positives for this extension was wrapper
functions that either always acquired or always released
locks. In this case, the locking rule is context-dependent; in
some contexts the rule is correct, in some contexts it is not.

When each function is analyzed, we set e to the number
of times the function correctly acquired and released
locks and c to the number of mismatched pairs. The highest
ranked functions had a large number of successful
acquire/release pairs with only a few errors. These functions
are exactly the ones that most likely contain errors.

Interprocedural analysis would solve this particular
problem, but all analysis has limits. For example, if we
extend the locking analysis to include the Linux semaphore
routines up and down, there will be a high rate of false
positives since semaphores are sometimes used as counters,
which need not be paired, and sometimes as locks, which
must be paired. Ranking easily distinguishes these two
different uses, whereas adding interprocedural analysis will not.

Discussion. Statistical ranking can also be used to infer
the severity or likelihood of real errors. The most serious
or most likely errors tend to violate rules that are almost
always followed. Thus, ranking is useful even for an approach
that does not report any false positives. Sound approaches
should find ranking especially useful because conservative
assumptions often lead to large numbers of false positives. In
our experience, false positives are not randomly distributed
but often come from a small set of analysis mistakes that
are automatically identified with ranking.

Ranking  can  be  a  simple  technique,  but,  from  the  error- 
inspector’s  point  of  view,  it  makes  an  exhilarating  difference. 

10.  RELATED  WORK 

In this section, we discuss other systems for finding bugs
in C programs. We divide these systems into those that
require programmer annotations and those that do not. Most
of the systems discussed here are sound whereas our system
is not. We focus on checking a broad class of properties that
are either difficult or impossible to specify soundly. Because
metal is flexible, we believe that our system can check a
wider variety of properties with a wider variance in precision
than any other system with similar goals. The discussion
below focuses on other differences between our system and
other static bug-finding tools.

10.1  Bug-Finding  Without  Annotations 

ESP [5] is the project most similar in spirit to our own.
Properties are specified in ESP using a state machine language
similar to metal. These properties are then verified
using a sound, interprocedural dataflow analysis based on
the RHS algorithm [18]. ESP includes the “abstract simulation”
algorithm, which is an interprocedural false-path
pruning algorithm. Our false-path pruning algorithm uses
a congruence closure algorithm which, in the intraprocedural
case, is more powerful than the algorithm actually used
in ESP. The ESP approach is more likely to scale in the
interprocedural case than ours.

The SLAM project [2] aims to verify temporal safety
properties by using a combination of predicate abstraction [15],
model checking [4], and predicate discovery. Our
approach and the SLAM approach have different goals:
SLAM is a verification tool intended for small, bug-prone
pieces of larger systems. It is effective within these scalability
limits. Our approach is intended for large systems.

Intrinsa’s PREfix [3] is an industrial-strength tool for C
that performs symbolic evaluation of interprocedural execution
paths while looking for errors such as uses of uninitialized
memory, buffer overflows, NULL-pointer dereferences,
and memory leaks. PREfix works on large software systems.
It does a deeper, more expensive analysis than our system
by building a memory model along each execution path in
the program. However, it only finds a fixed set of error
types using a fixed set of analyses. We allow programmers
to extend both.

10.2  Bug-Finding  With  Annotations 

There are many annotation-based checking projects.
Among the most developed are Extended Static Checking [7]
(ESC) and ESC/Java [19], annotation-based tools
that use a theorem prover to find errors. The annotations
used by ESC allow for varying levels of detail, which lets the
annotator balance annotation effort, completeness, and verification
time. For basic programming errors, the reported
annotation overhead for ESC ran as high as one annotation
per three lines of code for small programs. Recent work
on inferring these annotations attempts to reduce this burden [12].
Because of this effort, these tools run on small code bases
and find relatively few bugs compared to our approach.

LCLint [11] statically and unsoundly checks C programs
with the aid of programmer annotations. LCLint requires
additional annotations to improve precision, which leads to
a significant annotation burden to produce useful results.

Cqual [14] is an annotation-based approach that adds
flow-sensitive qualifiers to standard C types. The analysis
is interprocedural and sound. However, it requires annotations
both to express program properties and to suppress
false positives caused by conservative aliasing assumptions.
These annotations are a significant practical drawback.

In general, bug-finding techniques that rely on annotations
require strenuous, invasive code modifications. This
annotation overhead can be prohibitive for large systems.
One of the most rigorous measurements of this overhead
comes from Flanagan and Freund who measured an annotation
overhead of one annotation per 50 lines of code at a cost
of one programmer hour per thousand lines of code [13]. For
a system the size of Linux (2MLOC), this would require two
spells of 40 days and 40 nights of continuous annotating for
a single property! In contrast, once the fixed cost of writing
a metal extension is paid (often a day or so) there is little
incremental cost to applying it to a large amount of code.
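
As a rough check of this estimate, taking the quoted rates at face value and reading 2MLOC as 2,000,000 lines:

```latex
\begin{align*}
\text{annotations} &\approx \frac{2{,}000{,}000 \text{ lines}}{50 \text{ lines per annotation}} = 40{,}000 \\
\text{effort} &\approx \frac{2{,}000{,}000 \text{ lines}}{1000 \text{ lines per hour}} = 2000 \text{ hours}
  \approx 83 \text{ days} \approx 2 \times 40 \text{ days}
\end{align*}
```

The 83-day figure assumes annotating around the clock, which is what "40 days and 40 nights" implies.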

10.3  Language-based  approaches 

We view language-based approaches to preventing bugs
as largely complementary to our work. The Vault [6] language
lets users specify typestate properties within the language;
the role analysis concept proposed in [17] can specify
even more complex properties by providing a language mechanism
for specifying legal aliasing relationships. Programs
written correctly using these languages would be protected
from some of the bugs that we find.

Tool-based analysis, however, does have some significant
practical advantages. First, our statistical extensions
can automatically infer some of the temporal properties that
a language like Vault requires programmers to manually
specify [10]. Tools can also transparently check properties
without requiring the use of a specific language for code
construction or rewrites. Language adoption has historically
been an erratic process. Tools work immediately.

11.  CONCLUSION 

This paper describes a language for specifying program
properties, metal, and an analysis engine for checking these
properties statically, xgcc, that has found thousands of bugs
in real source code. The approach we present centers around
a single goal: to find as many bugs in real systems as possible.
Metal and xgcc are designed to support this goal.
One implication of our work is that finding bugs is easy
given the right approach. We present one possible approach
that centers around extensibility. An extensible specification
language can express a broad class of properties within
a single framework. Expressiveness in this language must be
matched with an efficient algorithm that does not impose too
many restrictions on the analyses it executes. Thus, extensibility
is a system-wide property. We believe that metal and
xgcc are a reasonable step towards building such a system.